24 research outputs found

    Predicting Sentence-Level Factuality of News and Bias of Media Outlets

    Predicting the factuality of news reporting and the bias of media outlets is relevant for automated news credibility assessment and fact-checking. While prior work has focused on the veracity of news, we propose a fine-grained reliability analysis of the entire media. Specifically, we study the prediction of sentence-level factuality of news reporting and bias of media outlets, which may explain more accurately the overall reliability of the entire source. We first manually produced a large sentence-level dataset, titled "FactNews", composed of 6,191 sentences expertly annotated according to factuality and media bias definitions from AllSides. We then present baseline models for sentence-level factuality prediction obtained by fine-tuning BERT. Finally, due to the severity of fake news and political polarization in Brazil, both the dataset and the baseline were proposed for Portuguese. However, our approach may be applied to any other language.

    HateBR: A Large Expert Annotated Corpus of Brazilian Instagram Comments for Offensive Language and Hate Speech Detection

    Due to the severity of offensive and hateful comments on social media in Brazil, and the lack of research in Portuguese, this paper provides the first large-scale expert-annotated corpus of Brazilian Instagram comments for hate speech and offensive language detection. The HateBR corpus was collected from the comment sections of Brazilian politicians' accounts on Instagram and manually annotated by specialists, reaching a high inter-annotator agreement. The corpus consists of 7,000 documents annotated according to three different layers: a binary classification (offensive versus non-offensive comments), an offensiveness-level classification (highly, moderately, and slightly offensive), and nine hate speech groups (xenophobia, racism, homophobia, sexism, religious intolerance, partyism, apology for the dictatorship, antisemitism, and fatphobia). We also implemented baseline experiments for offensive language and hate speech detection and compared them with a literature baseline. Results show that the baseline experiments on our corpus outperform the current state of the art for the Portuguese language. Comment: Published in the LREC 2022 Proceedings.

    Ácidos graxos Ômega-3 e Ômega-6 na nutrição de peixes – fontes e relações [Omega-3 and Omega-6 fatty acids in fish nutrition: sources and relationships]

    There are two series of essential fatty acids, which cannot be synthesized by animals or humans and must be supplied in the diet. The n-6 series is derived from linoleic acid (LA) and the n-3 series from alpha-linolenic acid (ALN). From these polyunsaturated fatty acids (PUFAs), arachidonic acid (AA), eicosapentaenoic acid (EPA), and docosahexaenoic acid (DHA) are synthesized. Fish are generally good sources of long-chain fatty acids, but there are differences between marine and freshwater species. Compared with freshwater fish, the lipids of marine fish are characterized by low levels of LA and ALN and high levels of long-chain n-3 highly unsaturated fatty acids (HUFA). However, freshwater fish seem to have a greater capacity to desaturate and elongate fatty acids synthesized by algae or plants into EPA and DHA. Marine fish oil has been used to feed farmed fish, but it is a finite resource. As fishery resources stagnate, its price tends to rise, which makes the search for alternative sources increasingly attractive. Sustainable alternatives to fish oil in fish feed include vegetable oils such as rapeseed oil and linseed oil. Research has been carried out to find the best n-3/n-6 ratio in fish muscle. This finding is important for increasing the nutritional value of fish for human health, since highly unsaturated fatty acids provide various benefits, such as preventing heart disease, and their consumption should be encouraged.

    Extended Multilingual Protest News Detection -- Shared Task 1, CASE 2021 and 2022

    We report results of the CASE 2022 Shared Task 1 on Multilingual Protest Event Detection. This task is a continuation of CASE 2021 and consists of four subtasks: i) document classification, ii) sentence classification, iii) event sentence coreference identification, and iv) event extraction. The CASE 2022 extension expands the test data with more data in previously available languages, namely English, Hindi, Portuguese, and Spanish, and adds new test data in Mandarin, Turkish, and Urdu for Subtask 1, document classification. The training data from CASE 2021 in English, Portuguese, and Spanish were used. Therefore, predicting document labels in Hindi, Mandarin, Turkish, and Urdu occurs in a zero-shot setting. The CASE 2022 workshop also accepts reports on systems developed for predicting the test data of CASE 2021. We observe that the best systems submitted by CASE 2022 participants achieve between 79.71 and 84.06 F1-macro for new languages in a zero-shot setting. The winning approaches mainly ensemble models and merge data in multiple languages. The best two submissions on CASE 2021 data outperform last year's submissions for Subtask 1 and Subtask 2 in all languages. Only two scenarios were not outperformed by new submissions on CASE 2021 data: Subtask 3 Portuguese and Subtask 4 English. Comment: To appear in CASE 2022 @ EMNLP 2022.

    Arachnids of medical importance in Brazil: main active compounds present in scorpion and spider venoms and tick saliva

    Semantic clustering of aspects for opinion mining

    With the rapid growth in the volume of opinionated information on the web, extracting and synthesizing subjective and relevant content has become a priority task that cuts across several domains of society, such as the political, social, and economic ones. The semantic organization of this type of content is important because it allows better use of these data and directly benefits consumers as well as private and governmental organizations. The area responsible for extracting, processing, and presenting subjective content is opinion mining, also known as sentiment analysis. Opinion mining is divided into levels of granularity: the document level, the sentence level, and the aspect level. This research addressed the finest level of granularity, aspect-based opinion mining, which consists of three main tasks: aspect recognition and clustering, polarity extraction, and summarization. Aspects are properties of the opinion target, and they may be implicit or explicit. Recognizing and clustering aspects are critical tasks for opinion mining; nonetheless, they are also challenging. For example, in opinionated texts, users use distinct terms to refer to the same property of an object. Therefore, this work focused on the problem of aspect clustering for opinion mining. To solve this problem, an approach based on linguistic knowledge was chosen. The main intrinsic and extrinsic phenomena in opinionated texts were investigated in order to find linguistic patterns and actionable inputs for proposing automatic methods for clustering related aspects. Six automatic methods based on linguistic knowledge for clustering explicit and implicit aspects were proposed, implemented, and compared. A new method was proposed for this task, which surpassed the other implemented methods, especially the synonym lexicon-based method (baseline) and the statistical model based on word embeddings. The proposed method does not depend on a particular language or domain; in this work, however, we focused on Brazilian Portuguese and the domain of web products.

    Queen of the Sea: Brazil in the cultural production of the Clara Nunes in the 1970s

    This paper presents an interpretation of the representations of Brazil in the cultural production of the singer Clara Nunes in the 1970s. As sources for constructing this narrative, we used the songs she performed, the album photographs, and reports from Revista Veja. We seek to understand how an artistic production aimed at the cultural market, in a strongly Eurocentric Brazil, exalted cultures of African origin and became a national success with good international reception in the 1970s. Coordenação de Aperfeiçoamento de Pessoal de Nível Superior.